Goto

Collaborating Authors

 building height


GraphCSVAE: Graph Categorical Structured Variational Autoencoder for Spatiotemporal Auditing of Physical Vulnerability Towards Sustainable Post-Disaster Risk Reduction

Dimasaka, Joshua, Geiß, Christian, Muir-Wood, Robert, So, Emily

arXiv.org Artificial Intelligence

In the aftermath of disasters, many institutions worldwide face challenges in continually monitoring changes in disaster risk, limiting the ability of key decision-makers to assess progress towards the UN Sendai Framework for Disaster Risk Reduction 2015-2030. While numerous efforts have substantially advanced the large-scale modeling of hazard and exposure through Earth observation and data-driven methods, progress remains limited in modeling another equally important yet challenging element of the risk equation: physical vulnerability. To address this gap, we introduce Graph Categorical Structured Variational Autoencoder (GraphCSVAE), a novel probabilistic data-driven framework for modeling physical vulnerability by integrating deep learning, graph representation, and categorical probabilistic inference, using time-series satellite-derived datasets and prior expert belief systems. We introduce a weakly supervised first-order transition matrix that reflects the changes in the spatiotemporal distribution of physical vulnerability in two disaster-stricken and socioeconomically disadvantaged areas: (1) the cyclone-impacted coastal Khurushkul community in Bangladesh and (2) the mudslide-affected city of Freetown in Sierra Leone. Our work reveals post-disaster regional dynamics in physical vulnerability, offering valuable insights into localized spatiotemporal auditing and sustainable strategies for post-disaster risk reduction.


LOD1 3D City Model from LiDAR: The Impact of Segmentation Accuracy on Quality of Urban 3D Modeling and Morphology Extraction

Chajaei, Fatemeh, Bagheri, Hossein

arXiv.org Artificial Intelligence

Three-dimensional reconstruction of buildings, particularly at Level of Detail 1 (LOD1), plays a crucial role in various applications such as urban planning, urban environmental studies, and designing optimized transportation networks. This study focuses on assessing the potential of LiDAR data for accurate 3D building reconstruction at LOD1 and extracting morphological features from these models. Four deep semantic segmentation models, U-Net, Attention U-Net, U-Net3+, and DeepLabV3+, were used, applying transfer learning to extract building footprints from LiDAR data. The results showed that U-Net3+ and Attention U-Net outperformed the others, achieving IoU scores of 0.833 and 0.814, respectively. Various statistical measures, including maximum, range, mode, median, and the 90th percentile, were used to estimate building heights, resulting in the generation of 3D models at LOD1. As the main contribution of the research, the impact of segmentation accuracy on the quality of 3D building modeling and the accuracy of morphological features like building area and external wall surface area was investigated. The results showed that the accuracy of building identification (segmentation performance) significantly affects the 3D model quality and the estimation of morphological features, depending on the height calculation method. Overall, the UNet3+ method, utilizing the 90th percentile and median measures, leads to accurate height estimation of buildings and the extraction of morphological features.


An AI-driven framework for rapid and localized optimizations of urban open spaces

Eshraghi, Pegah, Dehnavi, Arman Nikkhah, Mirdamadi, Maedeh, Talami, Riccardo, Zomorodian, Zahra-Sadat

arXiv.org Artificial Intelligence

As urbanization accelerates, open spaces are increasingly recognized for their role in enhancing sustainability and well-being, yet they remain underexplored compared to built spaces. This study introduces an AI-driven framework that integrates machine learning models (MLMs) and explainable AI techniques to optimize Sky View Factor (SVF) and visibility, key spatial metrics influencing thermal comfort and perceived safety in urban spaces. Unlike global optimization methods, which are computationally intensive and impractical for localized adjustments, this framework supports incremental design improvements with lower computational costs and greater flexibility. The framework employs SHapley Adaptive Explanations (SHAP) to analyze feature importance and Counterfactual Explanations (CFXs) to propose minimal design changes. Simulations tested five MLMs, identifying XGBoost as the most accurate, with building width, park area, and heights of surrounding buildings as critical for SVF, and distances from southern buildings as key for visibility. Compared to Genetic Algorithms, which required approximately 15/30 minutes across 3/4 generations to converge, the tested CFX approach achieved optimized results in 1 minute with a 5% RMSE error, demonstrating significantly faster performance and suitability for scalable retrofitting strategies. This interpretable and computationally efficient framework advances urban performance optimization, providing data-driven insights and practical retrofitting solutions for enhancing usability and environmental quality across diverse urban contexts.


Adopting Explainable-AI to investigate the impact of urban morphology design on energy and environmental performance in dry-arid climates

Eshraghi, Pegah, Talami, Riccardo, Dehnavi, Arman Nikkhah, Mirdamadi, Maedeh, Zomorodian, Zahra-Sadat

arXiv.org Artificial Intelligence

In rapidly urbanizing regions, designing climate-responsive urban forms is crucial for sustainable development, especially in dry arid-climates where urban morphology has a significant impact on energy consumption and environmental performance. This study advances urban morphology evaluation by combining Urban Building Energy Modeling (UBEM) with machine learning methods (ML) and Explainable AI techniques, specifically Shapley Additive Explanations (SHAP). Using Tehran's dense urban landscape as a case study, this research assesses and ranks the impact of 30 morphology parameters at the urban block level on key energy metrics (cooling, heating, and lighting demand) and environmental performance (sunlight exposure, photovoltaic generation, and Sky View Factor). Among seven ML algorithms evaluated, the XGBoost model was the most effective predictor, achieving high accuracy (R2: 0.92) and a training time of 3.64 seconds. Findings reveal that building shape, window-to-wall ratio, and commercial ratio are the most critical parameters affecting energy efficiency, while the heights and distances of neighboring buildings strongly influence cooling demand and solar access. By evaluating urban blocks with varied densities and configurations, this study offers generalizable insights applicable to other dry-arid regions. Moreover, the integration of UBEM and Explainable AI offers a scalable, data-driven framework for developing climate-responsive urban designs adaptable to high-density environments worldwide.


StreetviewLLM: Extracting Geographic Information Using a Chain-of-Thought Multimodal Large Language Model

Li, Zongrong, Xu, Junhao, Wang, Siqin, Wu, Yifan, Li, Haiyang

arXiv.org Artificial Intelligence

Traditional machine learning has played a key role in geospatial predictions, but its limitations have become more distinct over time. One significant drawback of traditional ML is that they often rely on structured geospatial data, such as raster or vector formats, affecting their ability to handle unstructured or multimodal data (Pierdicca & Paolanti, 2022). Additionally, traditional models may face challenges in capturing complex spatial patterns and regional variations, leading to challenges with data sparsity and uneven distribution, which could affect the accuracy and generalizability of predictions (Nikparvar & Thill, 2021). In contrast, large language models (LLMs) have shown great promise across various fields by processing vast amounts of data and reasoning across multiple modalities (Chang et al., 2024). By integrating textual, visual, and contextual information, LLMs can introduce novel covariates for geospatial predictions, thus enhancing traditional approaches. However, extracting geospatial knowledge from LLMs poses its challenges. Although using geographic coordinates (i.e., latitude and longitude) was a straightforward way to retrieve location-specific information, this approach often yields suboptimal results, particularly when dealing with complex spatial relationships and regional characteristics. As a result, the traditional model does not easily to harness the full potential of multi-modal data, hindering its effectiveness in applications demanding comprehensive, cross-modal insights.


Estimate the building height at a 10-meter resolution based on Sentinel data

Yan, Xin

arXiv.org Artificial Intelligence

Building height is an important indicator for scientific research and practical application. However, building height products with a high spatial resolution (10m) are still very scarce. To meet the needs of high-resolution building height estimation models, this study established a set of spatial-spectral-temporal feature databases, combining SAR data provided by Sentinel-1, optical data provided by Sentinel-2, and shape data provided by building footprints. The statistical indicators on the time scale are extracted to form a rich database of 160 features. This study combined with permutation feature importance, Shapley Additive Explanations, and Random Forest variable importance, and the final stable features are obtained through an expert scoring system. This study took 12 large, medium, and small cities in the United States as the training data. It used moving windows to aggregate the pixels to solve the impact of SAR image displacement and building shadows. This study built a building height model based on a random forest model and compared three model ensemble methods of bagging, boosting, and stacking. To evaluate the accuracy of the prediction results, this study collected Lidar data in the test area, and the evaluation results showed that its R-Square reached 0.78, which can prove that the building height can be obtained effectively. The fast production of high-resolution building height data can support large-scale scientific research and application in many fields.


Semi-supervised Learning from Street-View Images and OpenStreetMap for Automatic Building Height Estimation

Li, Hao, Yuan, Zhendong, Dax, Gabriel, Kong, Gefei, Fan, Hongchao, Zipf, Alexander, Werner, Martin

arXiv.org Artificial Intelligence

Accurate building height estimation is key to the automatic derivation of 3D city models from emerging big geospatial data, including Volunteered Geographical Information (VGI). However, an automatic solution for large-scale building height estimation based on low-cost VGI data is currently missing. The fast development of VGI data platforms, especially OpenStreetMap (OSM) and crowdsourced street-view images (SVI), offers a stimulating opportunity to fill this research gap. In this work, we propose a semi-supervised learning (SSL) method of automatically estimating building height from Mapillary SVI and OSM data to generate low-cost and open-source 3D city modeling in LoD1. The proposed method consists of three parts: first, we propose an SSL schema with the option of setting a different ratio of "pseudo label" during the supervised regression; second, we extract multi-level morphometric features from OSM data (i.e., buildings and streets) for the purposed of inferring building height; last, we design a building floor estimation workflow with a pre-trained facade object detection network to generate "pseudo label" from SVI and assign it to the corresponding OSM building footprint. In a case study, we validate the proposed SSL method in the city of Heidelberg, Germany and evaluate the model performance against the reference data of building heights. Based on three different regression models, namely Random Forest (RF), Support Vector Machine (SVM), and Convolutional Neural Network (CNN), the SSL method leads to a clear performance boosting in estimating building heights with a Mean Absolute Error (MAE) around 2.1 meters, which is competitive to state-of-the-art approaches. The preliminary result is promising and motivates our future work in scaling up the proposed method based on low-cost VGI data, with possibilities in even regions and areas with diverse data quality and availability.


A CNN regression model to estimate buildings height maps using Sentinel-1 SAR and Sentinel-2 MSI time series

Yadav, Ritu, Nascetti, Andrea, Ban, Yifang

arXiv.org Artificial Intelligence

Accurate estimation of building heights is essential for urban planning, infrastructure management, and environmental analysis. In this study, we propose a supervised Multimodal Building Height Regression Network (MBHR-Net) for estimating building heights at 10m spatial resolution using Sentinel-1 (S1) and Sentinel-2 (S2) satellite time series. S1 provides Synthetic Aperture Radar (SAR) data that offers valuable information on building structures, while S2 provides multispectral data that is sensitive to different land cover types, vegetation phenology, and building shadows. Our MBHR-Net aims to extract meaningful features from the S1 and S2 images to learn complex spatio-temporal relationships between image patterns and building heights. The model is trained and tested in 10 cities in the Netherlands. Root Mean Squared Error (RMSE), Intersection over Union (IOU), and R-squared (R2) score metrics are used to evaluate the performance of the model. The preliminary results (3.73m RMSE, 0.95 IoU, 0.61 R2) demonstrate the effectiveness of our deep learning model in accurately estimating building heights, showcasing its potential for urban planning, environmental impact analysis, and other related applications.


Building Floorspace in China: A Dataset and Learning Pipeline

Egger, Peter, Rao, Susie Xi, Papini, Sebastiano

arXiv.org Artificial Intelligence

This paper provides a first milestone in measuring the floorspace of buildings (that is, building footprint and height) for 40 major Chinese cities. The intent is to maximize city coverage and, eventually provide longitudinal data. Doing so requires building on imagery that is of a medium-fine-grained granularity, as larger cross sections of cities and longer time series for them are only available in such format. We use a multi-task object segmenter approach to learn the building footprint and height in the same framework in parallel: (1) we determine the surface area is covered by any buildings (the square footage of occupied land); (2) we determine floorspace from multi-image representations of buildings from various angles to determine the height of buildings. We use Sentinel-1 and -2 satellite images as our main data source. The benefits of these data are their large cross-sectional and longitudinal scope plus their unrestricted accessibility. We provide a detailed description of our data, algorithms, and evaluations. In addition, we analyze the quality of reference data and their role for measuring the building floorspace with minimal error. We conduct extensive quantitative and qualitative analyses with Shenzhen as a case study using our multi-task learner. Finally, we conduct correlation studies between our results (on both pixel and aggregated urban area levels) and nightlight data to gauge the merits of our approach in studying urban development. Our data and codebase are publicly accessible under https://gitlab.ethz.ch/raox/urban-satellite-public-v2.


CBHE: Corner-based Building Height Estimation for Complex Street Scene Images

Zhao, Yunxiang, Qi, Jianzhong, Zhang, Rui

arXiv.org Artificial Intelligence

Building height estimation is important in many applications such as 3D city reconstruction, urban planning, and navigation. Recently, a new building height estimation method using street scene images and 2D maps was proposed. This method is more scalable than traditional methods that use high-resolution optical data, LiDAR data, or RADAR data which are expensive to obtain. The method needs to detect building rooflines and then compute building height via the pinhole camera model. We observe that this method has limitations in handling complex street scene images in which buildings overlap with each other and the rooflines are difficult to locate. We propose CBHE, a building height estimation algorithm considering both building corners and rooflines. CBHE first obtains building corner and roofline candidates in street scene images based on building footprints from 2D maps and the camera parameters. Then, we use a deep neural network named BuildingNet to classify and filter corner and roofline candidates. Based on the valid corners and rooflines from BuildingNet, CBHE computes building height via the pinhole camera model. Experimental results show that the proposed BuildingNet yields a higher accuracy on building corner and roofline candidate filtering compared with the state-of-the-art open set classifiers. Meanwhile, CBHE outperforms the baseline algorithm by over 10% in building height estimation accuracy.